# Efficient Training
## Nanovlm 450M

MIT · Image-to-Text · Safetensors · by lusxvr · 339 downloads · 2 likes

nanoVLM is a lightweight vision-language model (VLM) designed for efficient training and experimentation.
## Nanovlm

MIT · Image-to-Text · Safetensors · by andito · 187 downloads · 1 like

nanoVLM is a lightweight vision-language model (VLM) designed for efficient training and experimentation.
## Qwen2.5 Coder 7B NEP Fix

Apache-2.0 · Large Language Model · Transformers · English · by lurf21 · 20 downloads · 1 like

A text generation and inference model based on Qwen/Qwen2.5-Coder-7B, fine-tuned with the Unsloth and TRL libraries for roughly 2x faster training.
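The listing gives no usage details, but a model fine-tuned with Unsloth can typically be loaded through Unsloth's FastLanguageModel wrapper. A minimal sketch follows; the repo id is a guess derived from the author and model name above, so substitute the actual repository before running.

```python
# Minimal sketch: loading an Unsloth-finetuned Qwen2.5-Coder model for inference.
# The repo id below is an assumption based on the listing, not a verified path.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="lurf21/Qwen2.5-Coder-7B-NEP-Fix",  # hypothetical repo id
    max_seq_length=4096,
    load_in_4bit=True,  # 4-bit loading keeps memory usage low
)
FastLanguageModel.for_inference(model)  # enable Unsloth's faster inference path

prompt = "Write a Python function that reverses a string."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```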
## Bonsai

Large Language Model · Transformers · by deepgrove · 113 downloads · 8 likes

Bonsai is a small ternary-weighted language model with 500 million parameters, built on the Llama architecture and using the Mistral tokenizer, trained on fewer than 5 billion tokens.
## RWKV7 Goose Pile 168M HF

Apache-2.0 · Large Language Model · Transformers · English · by RWKV · 57 downloads · 2 likes

An RWKV-7 model in the Flash Linear Attention format, trained on the Pile dataset and supporting English text generation.
## Traceback 12b

Apache-2.0 · Large Language Model · Transformers · by secemp9 · 1,470 downloads · 29 likes

TraceBack 12b is a 4-bit quantized model based on the Mistral-Nemo-Instruct architecture, focused on instruction following and chain-of-thought reasoning.
## Slam

MIT · Audio Generation · Transformers · by slprl · 115 downloads · 10 likes

A speech language model over discrete HuBERT tokens, focused on efficient training and able to generate continuations of speech segments.
## Open Reasoner Zero 7B

MIT · Large Language Model · Transformers · by Open-Reasoner-Zero · 776 downloads · 28 likes

Open Reasoner Zero is an open-source implementation of large-scale, reasoning-oriented reinforcement learning on base models, emphasizing scalability, simplicity, and ease of use.
## Deepseek R1 Distill Llama 8B Finance V1

Apache-2.0 · Large Language Model · Transformers · English · by abhi9ab · 1,227 downloads · 6 likes

A finance-domain language model fine-tuned from DeepSeek-R1-Distill-Llama-8B using LoRA, suited to financial Q&A and instruction-following tasks.
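Since the card mentions LoRA fine-tuning, the adapter would normally be attached to the base model with PEFT. A minimal sketch, assuming a hypothetical adapter repo id derived from the listing; if the repository instead ships merged weights, a plain `AutoModelForCausalLM.from_pretrained` call on that repo is enough.

```python
# Minimal sketch: applying a LoRA adapter to the DeepSeek-R1-Distill-Llama-8B base model.
# The adapter repo id is an assumption from the listing, not a verified path.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B"
adapter_id = "abhi9ab/DeepSeek-R1-Distill-Llama-8B-Finance-v1"  # hypothetical

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)  # attach the LoRA adapter

prompt = "Explain the difference between revenue and operating income."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```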
## Llama 3.2 11B Vision Radiology Mini

Apache-2.0 · Text-to-Image · Transformers · English · by mervinpraison · 39 downloads · 2 likes

A vision instruction-tuned model fine-tuned with Unsloth, supporting multimodal (image and text) tasks.
## Llama 3 Instruct 8B SimPO

Large Language Model · Transformers · by princeton-nlp · 1,924 downloads · 58 likes

SimPO is a reference-free preference optimization method that simplifies the traditional RLHF pipeline by optimizing the language model directly on preference data, without a reference model or a separately trained reward model.
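Because SimPO only changes the training objective, the released checkpoint behaves like a standard Llama-3-style causal LM and can be used through the usual transformers chat interface. A minimal sketch; the repo id is inferred from the author and model name in the listing, so verify it before use.

```python
# Minimal sketch: chatting with the SimPO-trained Llama 3 Instruct checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "princeton-nlp/Llama-3-Instruct-8B-SimPO"  # inferred from the listing
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

messages = [{"role": "user", "content": "Summarize what preference optimization does in one sentence."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```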
## Mistral Supra

Apache-2.0 · Large Language Model · PyTorch · English · by TRI-ML · 163 downloads · 12 likes

Mistral-SUPRA is a linear RNN model initialized from Mistral-7B, combining properties of Transformer and recurrent models.
## Moe LLaVA Qwen 1.8B 4e

Apache-2.0 · Text-to-Image · Transformers · by LanguageBind · 176 downloads · 14 likes

MoE-LLaVA is a large vision-language model based on a Mixture-of-Experts architecture, achieving efficient multimodal learning through sparsely activated parameters.
## Is New Dataset Teacher Model

Apache-2.0 · Text Classification · by librarian-bots · 168 downloads · 1 like

A few-shot text classification model built with the SetFit framework, combining contrastive fine-tuning of a sentence embedding model with a lightweight classification head.
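SetFit checkpoints are used through the setfit library rather than raw transformers. A minimal sketch, assuming a hypothetical repo id derived from the author and model name above.

```python
# Minimal sketch: few-shot text classification with a SetFit checkpoint.
# The repo id is an assumption based on the listing, not a verified path.
from setfit import SetFitModel

model = SetFitModel.from_pretrained("librarian-bots/is_new_dataset_teacher_model")  # hypothetical id
preds = model.predict([
    "We introduce a new benchmark dataset for multilingual summarization.",
    "This paper proposes a faster optimizer for transformer training.",
])
print(preds)  # predicted labels, e.g. whether each abstract announces a new dataset
```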
## Godot Dodo 4x 60k Llama 13b

Large Language Model · Transformers · by minosu · 43 downloads · 8 likes

Godot-Dodo is an instruction-following model fine-tuned from LLaMA 13B, specializing in understanding and generating code instructions.
## Pepe

Image Classification · by PeskyAmiable · 0 downloads · 0 likes

A Keras-based image classification model supporting multiple pre-trained architectures, suitable for common image classification tasks.
## Ppo Pendulum V1

Physics Model · by ernestumorga · 16 downloads · 0 likes

A reinforcement learning model trained with the PPO algorithm to solve the control problem posed by the Pendulum-v1 environment.
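PPO on Pendulum-v1 is a standard Stable-Baselines3 setup, so a comparable agent can be trained from scratch in a few lines. This sketch does not load the published checkpoint, and the hyperparameters are illustrative defaults rather than the ones used for this model.

```python
# Minimal sketch: training a PPO agent on Pendulum-v1 with Stable-Baselines3.
# Hyperparameters are illustrative defaults, not the published model's settings.
import gymnasium as gym
from stable_baselines3 import PPO

env = gym.make("Pendulum-v1")
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=100_000)

# Roll out the trained policy for a few hundred steps.
obs, _ = env.reset()
for _ in range(200):
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, _ = env.step(action)
    if terminated or truncated:
        obs, _ = env.reset()
```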
## Distilbert Dot Tas B B256 Msmarco

Text Embedding · Transformers · English · by sebastian-hofstaetter · 3,188 downloads · 23 likes

A DistilBERT-based dual encoder with dot-product scoring, trained on MSMARCO-Passage with balanced topic-aware sampling, suitable for dense retrieval and re-ranking of candidate sets.
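A dual encoder with dot-product scoring embeds queries and passages independently and ranks passages by the dot product of the two vectors. A minimal sketch with transformers, assuming the repo id below and [CLS]-vector pooling; check the model card for the exact pooling it uses.

```python
# Minimal sketch: dot-product dense retrieval scoring with a DistilBERT dual encoder.
# Repo id and CLS pooling are assumptions; verify against the model card.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "sebastian-hofstaetter/distilbert-dot-tas_b-b256-msmarco"
tokenizer = AutoTokenizer.from_pretrained(model_id)
encoder = AutoModel.from_pretrained(model_id)

def encode(texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = encoder(**batch)
    return out.last_hidden_state[:, 0, :]  # [CLS] vector as the text embedding

query_vec = encode(["what is dense retrieval"])
passage_vecs = encode([
    "Dense retrieval encodes queries and documents into vectors and matches them by similarity.",
    "The Eiffel Tower is located in Paris.",
])
scores = query_vec @ passage_vecs.T  # dot-product relevance scores
print(scores)
```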
## Deit Base Patch16 224

Apache-2.0 · Image Classification · Transformers · by facebook · 152.63k downloads · 13 likes

DeiT (Data-efficient image Transformer) is a Vision Transformer trained with data-efficient training strategies, pretrained and fine-tuned on the ImageNet-1k dataset at 224x224 resolution.
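The checkpoint is a standard transformers image-classification model, so the pipeline API is the quickest way to try it. A minimal sketch; the repo id matches the author and model name in the listing.

```python
# Minimal sketch: ImageNet-1k classification with the DeiT base checkpoint.
from transformers import pipeline

classifier = pipeline("image-classification", model="facebook/deit-base-patch16-224")

# Any local path or URL to an image works here; this URL is only an example.
results = classifier("http://images.cocodataset.org/val2017/000000039769.jpg")
for r in results:
    print(f"{r['label']}: {r['score']:.3f}")
```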
## Bert Mini Finetuned Squadv2

Question Answering System · Transformers · by M-FAC · 17 downloads · 0 likes

This model is based on the BERT-mini architecture and fine-tuned on the SQuAD 2.0 dataset with the M-FAC second-order optimizer for question answering tasks.